Like holding hands with the person you love, happiness is a simple, unforgettable moment in our lives. But does everyone experience happiness in the same way, or in drastically different ways?

With technology that helps us understand how people describe their happiness, we present the following analysis to visualize and understand the science behind happiness by applying NLP methodologies to positive psychology. I especially want to thank Arpita Shah and Tian Zheng for providing coding resources for this R notebook.

Happiness is when what you think, what you say, and what you do are in harmony. ---Gandhi

INTRODUCTION

This project happily unites science and art. The study of happiness is an area of positive psychology that examines the factors that sustain people’s happiness in their lives. What factors can we relate to when we study the science of happiness? Is it money? Is it marital status? Or is it material belongings?

Background

Asai et al. (2018) proposed a data set and outlined a few NLP problems that can be studied with it. HappyDB is a corpus of 100,000 crowd-sourced happy moments collected via Amazon’s Mechanical Turk. The paper is available at https://arxiv.org/abs/1801.07746. We explore this data set and try to answer the question, “What makes people happy?”

EXPLORATORY ANALYSIS

In this section, we dive into the analysis and the technical part of the project.

Let us take a quick preview of what the data set looks like.
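
The original notebook was written in R and its code is not shown here, so the following is only a minimal Python sketch of how such a preview could be built. It assumes the happy moments and the worker demographics live in two tables linked by a worker id; the column names (`wid`, `cleaned_hm`, `age`, `country`, `gender`, `marital`) and the toy rows are assumptions made for illustration.

```python
import csv
import io

# Toy stand-ins for the two tables; the rows are invented for illustration.
# Column names are assumptions, not confirmed by the text above.
MOMENTS_CSV = """wid,cleaned_hm
1,I had dinner with my best friend.
2,My son took his first steps today.
1,I finished a long run in the park.
"""

DEMOGRAPHICS_CSV = """wid,age,country,gender,marital
1,29,USA,f,single
2,41,IND,m,married
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_worker(moments, demographics):
    """Attach each worker's demographic fields to every moment they wrote."""
    by_worker = {row["wid"]: row for row in demographics}
    joined = []
    for moment in moments:
        profile = by_worker.get(moment["wid"])
        if profile is not None:  # skip moments without demographic data
            joined.append({**moment, **profile})
    return joined


happy = join_on_worker(load_rows(MOMENTS_CSV), load_rows(DEMOGRAPHICS_CSV))
for row in happy[:2]:  # preview only the first rows
    print(row)
```

Previewing only a handful of rows keeps the output small even when the full table has many thousands of entries.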

WHAT HAPPINESS CONSISTS OF

When we are happy, what sort of memory do we associate with that feeling? Do we feel happy when we eat delicious food? Or do we feel happy when we are with friends and family? To start answering this kind of question, we dive into the text input of the HappyDB data.

For a global view, we look at the high-frequency words in the data set when people are asked to recall their past moments of happiness. We can draw a word cloud over all the text inputs, keeping the top 50 words.
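
The word counts behind a word cloud can be sketched in a few lines of Python. This is an illustration rather than the notebook's R code: the stop-word list is a tiny hand-picked assumption, no stemming or lemmatization is applied, and the three example sentences are invented.

```python
import re
from collections import Counter

# A tiny hand-picked stop-word list (an assumption); a real analysis
# would use a fuller list and probably lemmatize the words as well.
STOP_WORDS = {
    "a", "an", "and", "the", "i", "my", "with", "to",
    "in", "of", "was", "had", "for", "at", "we",
}


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def top_words(moments, n=50):
    """Return the n most frequent words across all moments as (word, count) pairs."""
    counts = Counter()
    for moment in moments:
        counts.update(tokenize(moment))
    return counts.most_common(n)


# Invented example moments, standing in for the real text column.
moments = [
    "I had dinner with my friend.",
    "My friend and I watched a movie.",
    "I spent the day with family.",
]
print(top_words(moments, n=3))
```

The resulting (word, count) pairs are what a word cloud library would scale into font sizes.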

What Words Do We Say

The top words can be examined further by looking at their exact frequencies. The top word is “friend”. There is an old saying that misfortune tests the sincerity of friends. Well, so do statistics, right?

Is Happiness Different Between Genders

In kindergarten, people generally associate boys with guns and girls with Barbies. Does gender truly make a difference in the happy memories people recall? We can draw a scatter plot of the frequency of the words each gender uses when asked to bring up a happy memory. In this case, the scatter plot has “female” on the x-axis and “male” on the y-axis. A typical example is “basketball”, which has a higher frequency for “male” than for “female”. A notable example in the other direction is “makeup”, which is frequent for “female” but not so much for “male”.
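
The coordinates of such a scatter plot can be sketched as follows. This Python snippet is an assumption about how the points might be computed, not the notebook's R code: each word gets its relative frequency within each group, with one group on the x-axis and the other on the y-axis. The example sentences are invented.

```python
from collections import Counter


def relative_frequencies(texts):
    """Map each word to its share of all words used by one group."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def scatter_points(group_x, group_y):
    """Return {word: (freq_x, freq_y)}; a word missing from a group gets 0.0."""
    freq_x = relative_frequencies(group_x)
    freq_y = relative_frequencies(group_y)
    vocabulary = sorted(set(freq_x) | set(freq_y))
    return {w: (freq_x.get(w, 0.0), freq_y.get(w, 0.0)) for w in vocabulary}


# Invented example moments for the two groups.
female = ["bought new makeup", "dinner with friend"]
male = ["played basketball", "basketball with friend"]

points = scatter_points(female, male)  # x = female, y = male
for word, (x, y) in points.items():
    print(f"{word:12s} x={x:.2f} y={y:.2f}")
```

Words far above the diagonal are used relatively more by the y-axis group, and words far below it are used relatively more by the x-axis group.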

Married or Single

Is marriage an important factor in differentiating memories of happiness? We can draw a scatter plot of word frequencies for “married” and “single” people. In the following plot, we have “married” on the x-axis and “single” on the y-axis. Intuitively, “child” is a word associated with “married” people when they describe their happiness.
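
Beyond reading words off a scatter plot, one way to rank the words that separate two groups is a smoothed log ratio of their relative frequencies. This is a technique swapped in for illustration, not something the notebook is known to use; the add-one smoothing and the invented example sentences are assumptions.

```python
import math
from collections import Counter


def word_counts(texts):
    """Count whitespace-separated, lower-cased words across all texts."""
    return Counter(word for text in texts for word in text.lower().split())


def log_ratio_ranking(texts_a, texts_b, smoothing=1.0):
    """Rank words from 'most typical of A' to 'most typical of B'.

    Uses additive smoothing so words absent from one group do not
    produce a division by zero or a log of zero.
    """
    counts_a, counts_b = word_counts(texts_a), word_counts(texts_b)
    vocabulary = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + smoothing * len(vocabulary)
    total_b = sum(counts_b.values()) + smoothing * len(vocabulary)
    scores = {
        w: math.log((counts_a[w] + smoothing) / total_a)
        - math.log((counts_b[w] + smoothing) / total_b)
        for w in vocabulary
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented example moments for the two groups.
married = ["my child smiled", "dinner with my child"]
single = ["dinner with my friend", "my friend visited"]

ranking = log_ratio_ranking(married, single)
print(ranking[0], ranking[-1])
```

Words with scores near zero are used about equally often by both groups, so only the two ends of the ranking are informative.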

Does Culture Have Nationality

Does happiness change from country to country? We can approach this kind of question using the United States as an example. We create an indicator variable telling us whether a respondent’s country is the United States or not. We can then draw a boxplot of age. For the United States, people who described happy moments have a wider age distribution than those in the rest of the world. We can also look at a violin plot, which uses kernel smoothing, to get a better view of the distribution.
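
The indicator variable and the numbers a boxplot draws can be sketched in Python as below. This is an illustration, not the notebook's R code. The country code `"USA"` is an assumed encoding, and the ages are invented for the example, so the printed values say nothing about the real data.

```python
from statistics import quantiles


def is_usa(country):
    """Indicator variable: 1 for the United States, 0 for any other country."""
    return 1 if country == "USA" else 0


def box_stats(ages):
    """Summary values that a boxplot displays, plus the interquartile range."""
    q1, median, q3 = quantiles(ages, n=4)
    return {
        "min": min(ages),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(ages),
        "iqr": q3 - q1,
    }


# Invented (country, age) pairs for illustration only.
workers = [
    ("USA", 22), ("USA", 35), ("USA", 48), ("USA", 61),
    ("IND", 25), ("IND", 28), ("IND", 31), ("GBR", 34),
]

groups = {0: [], 1: []}
for country, age in workers:
    groups[is_usa(country)].append(age)

summary = {flag: box_stats(ages) for flag, ages in groups.items()}
for flag, stats in summary.items():
    print("USA" if flag else "Not USA", stats)
```

Comparing the interquartile range and the min-to-max span of the two groups is a numeric way of saying that one box is wider than the other.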

To take this analysis further, we can look at the word cloud for people who expressed happiness in the USA and the word cloud for those outside the USA. We plot the USA first and the rest of the world next. Both graphs are presented below. We can see that the top 3 words in the USA are similar to those in the rest of the world: “time”, “friend”, and “day” are common around the world. There are also differences: in the USA, some of the high-frequency words, such as “find”, “time”, and “played”, are not among the top words for the rest of the world.

Let us peel another layer off the onion. That is, let us introduce another variable, say gender, alongside nationality and see what the results are. The purpose is to create a two-way interaction. Treating nationality and gender as binary variables, we get four partitions: (USA, Male), (USA, Female), (Not USA, Male), and (Not USA, Female). Thus, we can compute four distinct sets of high-frequency word counts, one for each partition.
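
The four-way split can be sketched in Python as follows. This is an illustration, not the notebook's R code. The gender codes `"m"` and `"f"`, the country code `"USA"`, and the example records are assumptions; records with any other gender code are skipped because the analysis treats gender as binary.

```python
from collections import Counter, defaultdict


def partition_key(country, gender):
    """Map a record to one of the four (nationality, gender) partitions."""
    if gender not in {"m", "f"}:
        return None  # outside the binary split used here
    nationality = "USA" if country == "USA" else "Not USA"
    return (nationality, "Male" if gender == "m" else "Female")


def top_words_by_partition(records, n=3):
    """Return the n most frequent words within each partition."""
    counts = defaultdict(Counter)
    for country, gender, text in records:
        key = partition_key(country, gender)
        if key is None:
            continue
        counts[key].update(text.lower().split())
    return {key: counter.most_common(n) for key, counter in counts.items()}


# Invented (country, gender, moment) records for illustration only.
records = [
    ("USA", "m", "played basketball"),
    ("USA", "f", "lunch with friend"),
    ("IND", "m", "watched cricket"),
    ("IND", "f", "family dinner"),
    ("USA", "m", "basketball game"),
]

result = top_words_by_partition(records, n=1)
for key, words in result.items():
    print(key, words)
```

Keeping one counter per partition makes it straightforward to draw a separate word cloud or bar chart for each of the four groups.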

SUMMARY

This blog post shares a few exploratory results on the HappyDB data set and attempts to answer the question “what makes people happy”. We draw the following conclusions:

  1. There are certain words people use when they describe experiencing happiness. These words are fairly intuitive, such as “friend”, “day”, “family”, and so on.

  2. We do observe a difference between genders in how people describe happiness.

  3. We also observe a difference between married and single people.

  4. We use the USA as an example to see if there is a difference between the USA and the rest of the world. We observe a wider age distribution in the USA than in the rest of the world. The high-frequency words in the top three positions appear to be similar. Words start to differ among the top 20 high-frequency words between the USA and the rest of the world.

  5. Lastly, we introduce a two-way interaction using nationality (USA or not) and gender (male or female) to compare high-frequency words across the resulting groups.

REFERENCE

Asai et al. (2018), “HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments”, https://arxiv.org/abs/1801.07746.